Add dtype parameter to kspaceFirstOrder() (#695) #716
Conversation
Exposes precision control on the modern unified API to match the legacy `SimulationOptions.data_cast` and MATLAB k-Wave's `DataCast`:

- `data_cast='off'` -> `np.float64` (default; matches legacy)
- `data_cast='double'` -> `np.float64` (alias for `'off'`, MATLAB compat)
- `data_cast='single'` -> `np.float32` (~half memory, faster, lower accuracy)

Python backend: plumbs through `Simulation`, which now stores `self._dtype` and uses it for all state arrays (p, u, rho_split, sensor_data buffers, PML arrays, alpha_coeff/BonA/p0 expansions, source signal buffers). Default behavior unchanged (float64 everywhere).

C++ backend: `data_cast` has no effect; the binary uses fixed internal precision regardless of HDF5 input dtype. Setting anything other than `'off'`/`'double'` with `backend='cpp'` emits a `UserWarning` explaining this and pointing users at `backend='python'` for precision control.

Tests: 8 new in `tests/test_data_cast.py` covering: output dtype matches request, default behavior unchanged, invalid value raises, single vs double numerical agreement within float32 tolerance, and the C++ warn/silent paths. Wider suite (62 tests across native_solver, ivp_homogeneous, issue_664) still passes.

Closes #695.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
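Illustratively, the alias mapping and the "~half memory" claim above can be checked with plain numpy (the names below are for illustration only, not the PR's code):

```python
import numpy as np

# Illustrative mapping of the three accepted data_cast values (not the
# PR's actual code).
DATA_CAST = {"off": np.float64, "double": np.float64, "single": np.float32}

# Single precision uses half the memory of double for the same grid:
a32 = np.zeros((128, 128), dtype=np.float32)
a64 = np.zeros((128, 128), dtype=np.float64)
print(a32.nbytes, a64.nbytes)  # 65536 131072
```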
Codecov Report: ❌ Patch coverage is …

Additional details and impacted files:

```
@@           Coverage Diff            @@
##           master     #716     +/-  ##
=========================================
+ Coverage   75.04%   75.34%   +0.29%
=========================================
  Files          57       57
  Lines        8128     8164      +36
  Branches     1584     1593       +9
=========================================
+ Hits         6100     6151      +51
+ Misses       1405     1392      -13
+ Partials      623      621       -2
```
Flags with carried forward coverage won't be shown.
Make the precision parameter Pythonic instead of MATLAB-stringly-typed.
The numpy ecosystem's convention is to accept dtype-like inputs broadly
(numpy types, strings, dtype objects), and the modern API should
follow that idiom rather than the legacy SimulationOptions.data_cast
naming.
Accepted forms (resolved via _resolve_dtype, which uses np.dtype()):
None / np.float64 / "float64" / "double" / float / "off" / np.dtype("f8")
-> np.float64 (default)
np.float32 / "float32" / "single" / np.dtype("f4")
-> np.float32
The MATLAB aliases ("off", "single", "double") are kept as compat
shortcuts so users porting from the legacy API or MATLAB k-Wave have
zero friction. Anything resolving to a non-float32/float64 type
(np.float16, np.complex64, etc.) raises ValueError -- the solver
isn't validated for those.
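A minimal sketch of this resolution logic, assuming a helper of roughly this shape (the PR's actual _resolve_dtype may differ in details):

```python
import numpy as np

# Hypothetical sketch of the resolution described above.
_MATLAB_ALIASES = {"off": "float64", "double": "float64", "single": "float32"}

def resolve_dtype(value):
    if value is None:
        return np.dtype(np.float64)
    if isinstance(value, str):
        value = _MATLAB_ALIASES.get(value, value)
    dt = np.dtype(value)  # accepts numpy types, strings, dtype objects
    if dt not in (np.dtype(np.float32), np.dtype(np.float64)):
        raise ValueError(f"dtype must resolve to float32 or float64, got {dt}")
    return dt
```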
C++ backend warns when dtype is not np.float64 (binary uses fixed
internal precision regardless).
Tests: 22 (was 8) parametrized over every input form. Wider suite
(62 tests) still passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptile-apps re-review
Greptile P2 (test for p_final dtype) caught a real bug: with
dtype='single', p_final came back as float64 even though sensor_data
buffers (p, p_max, p_min, p_rms) were correctly float32.
Root cause: two sources of float64 leaking into the hot loop:
1. xp.fft.fftfreq returns float64; k_list, kappa, op_grad/div_list,
_k_mag inherited it. _diff's FFT round-trip (float64 op * complex64
field) upcasts to complex128, .real => float64. Result: self.p and
self.u rebound to float64 mid-step() despite being allocated as
float32. p_final = self.p[interior].copy() picked up float64.
sensor_data buffers stayed float32 because writes are in-place into
the pre-allocated buffer (silent narrowing on assignment).
2. get_pml returns float64 unconditionally; the per-step pml multiply
was a second upcast path independent of (1).
Both cast sites now apply .astype(self._dtype) at construction time,
keeping the entire compute pipeline in the user's requested precision.
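The upcast path in (1) can be reproduced with plain numpy:

```python
import numpy as np

# Demonstrates the float64 leak described above: fftfreq output is float64,
# and any float64 operator upcasts the complex64 field.
k = np.fft.fftfreq(8)                      # always float64
field = np.zeros(8, dtype=np.complex64)

print((k * field).dtype)        # complex128: the float64 operator upcasts
print((k * field).real.dtype)   # float64 leaks back into the state array

k32 = k.astype(np.float32)      # the fix: cast at construction time
print((k32 * field).dtype)      # complex64 stays complex64
```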
Test updated: float32 / float64 input parametrizations now request
('p', 'p_final', 'p_max', 'p_min', 'p_rms') and assert every field's
dtype matches. Verified: float32 inputs => all five fields float32;
float64 => all five float64.
Bonus: helpful error for torch / jax / tensorflow dtype objects via
duck-typed __module__ check (no framework imports needed); cupy works
for free since cp.float32 is np.float32.
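The duck-typed check could look roughly like this (a hypothetical sketch; the PR's actual helper and message may differ):

```python
import numpy as np

# Hypothetical sketch: inspect the object's type's __module__ instead of
# importing torch / jax / tensorflow.
def framework_hint(obj):
    root = (getattr(type(obj), "__module__", "") or "").split(".")[0]
    if root in ("torch", "jax", "jaxlib", "tensorflow"):
        return f"got a {root} dtype; pass a numpy dtype such as np.float32"
    return None

print(framework_hint(np.float32))  # None: plain numpy input needs no hint
```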
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…odern-api # Conflicts: # kwave/solvers/kspace_solver.py
Greptile spotted a third dtype-drift path: ``sum(rho_split)`` in ``_nl_factor`` and the equation-of-state line starts with Python ``int 0``. Under numpy < 2 (pre-NEP 50), ``int + float32 -> float64``, so:

    nl_factor = (2 * sum(rho_split) + rho0) / rho0

is float64 even when rho_split is float32. The product ``rho0 * div_u_i * nl_factor`` in mass conservation then upcasts the rho_split arrays to float64 on the very first step. This specifically affects any simulation that enables BonA.

Fix: an ``_array_sum`` helper that starts the accumulator from ``arrays[0]`` so the dtype is preserved. Used in both call sites (the ``_nl_factor`` lambda and the equation-of-state ``rho_total``).

Test added: ``test_python_backend_dtype_preserved_with_nonlinearity`` exercises the BonA path with ``dtype=np.float32`` and asserts p / p_final / p_max all remain float32.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptile-apps re-review
End-to-end verification under numpy 1.26.4 (with my dtype tests requesting
('p', 'p_final', 'p_max', 'p_min', 'p_rms')) showed self.p still upcast to
float64 mid-step despite all the prior precision fixes.
Root cause: numpy < 2's `np.fft.fftn` always returns complex128 regardless
of input precision -- a known difference resolved in numpy 2 (NEP 50 era).
The k-space ops (op_grad/div_list, unstagger_ops) being complex64 isn't
enough; multiplying complex64 by complex128 upcasts to complex128, and
.real -> float64 propagates back into self.p.
Fix: introduce self._complex_dtype (complex64 / complex128 matching
self._dtype), and cast every fftn / ifftn .real result back to the
intended precision. Three call sites: step() momentum loop, sensor-data
unstagger, _diff helper. ``copy=False`` makes the cast a no-op when the
dtype already matches (numpy 2, cupy).
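The cast-back pattern can be reproduced with plain numpy (a sketch; the variable names are illustrative):

```python
import numpy as np

p = np.random.default_rng(0).random((16, 16)).astype(np.float32)
op = np.ones((16, 16), dtype=np.complex64)

# On numpy < 2, fftn returns complex128 even for low-precision input, so the
# .real result must be cast back; copy=False makes the cast free when the
# dtype already matches.
out = np.fft.ifftn(op * np.fft.fftn(p)).real.astype(np.float32, copy=False)
print(out.dtype)  # float32 on both numpy 1.x and 2.x
```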
Greptile flagged dt_over_rho0, kappa, and sum() as P1 dtype-promotion
paths. Verified empirically on numpy 1.26.4 that none of those upcast in
practice (Python float / int / complex are weak types in numpy 1.x as
well) -- but the test failures Greptile would have seen had the same
visible signature, so the diagnosis pointed at the right region.
Tests: 86 (24 dtype + 62 wider suite) pass on numpy 1.26.4 AND numpy 2.2.6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greptile was right that the flagged diff was unsafe on numpy<2, but the actual culprit was different; fixed in 22ec78d. So: same conclusion (not safe on numpy<2 before this commit), different mechanism. Re-triggering review now.

@greptile-apps re-review
```python
# Spectral shift: move velocity from staggered (mid-cell) to collocated (pressure) grid
self.unstagger_ops = [xp.exp(-1j * self.k_list[ax] * self.spacing[ax] / 2) for ax in range(self.ndim)]
```
unstagger_ops also built with 1j * literal → always complex128 on numpy < 2
The same Python-literal promotion issue as op_grad_list/op_div_list applies here: -1j * self.k_list[ax] is complex128 on numpy < 2, regardless of k_list dtype. Cast to self._complex_dtype after construction to keep the unstaggering FFT in the correct precision.
```diff
  # Spectral shift: move velocity from staggered (mid-cell) to collocated (pressure) grid
- self.unstagger_ops = [xp.exp(-1j * self.k_list[ax] * self.spacing[ax] / 2) for ax in range(self.ndim)]
+ self.unstagger_ops = [xp.exp(-1j * self.k_list[ax] * self.spacing[ax] / 2).astype(self._complex_dtype) for ax in range(self.ndim)]
```
Verified empirically on numpy 1.26.4: -1j * k_f32 * spacing / 2 evaluates to complex64 and np.exp(...) preserves complex64 — Python complex is a weak type just like Python float/int, so the multiplication chain stays in complex64 when k_list is float32. So unstagger_ops is already complex64 without an explicit cast.
Additionally, the prior commit (22ec78d) already added .astype(self._dtype, copy=False) to the staggered ifftn .real line in step(), so the recorded output stays float32 even if unstagger_ops were ever to change precision.
Added a regression test (d545952) that records ('p', 'ux_staggered', 'uy_staggered') with dtype=np.float32 and asserts each is float32 — locks the behavior in for both numpy 1.26.4 and 2.2.6.
@greptile-apps re-review
The empirical result doesn't match what NEP 50's own migration table documents. The NEP 50 spec explicitly shows the old (numpy < 2) vs new (numpy ≥ 2 / NEP 50) behaviour:
| Expression | numpy < 2 (old) | numpy ≥ 2 (NEP 50) |
|---|---|---|
| `(3j + array(3, complex64)).dtype` | `complex128` | `complex64` |
| `(float32(1) + 1j).dtype` | `complex128` | `complex64` |
Source: https://numpy.org/neps/nep-0050-scalar-promotion.html
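Those two rows can be evaluated directly; per the migration table, both print complex128 on numpy < 2 and complex64 on numpy ≥ 2:

```python
import numpy as np

# The two expressions from the NEP 50 migration table; the result dtype
# depends on the numpy major version, as the table shows.
d1 = (3j + np.array(3, dtype=np.complex64)).dtype
d2 = (np.float32(1) + 1j).dtype
print(d1, d2)
```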
Python complex literals are not weak types in numpy < 2 — they behave like np.complex128 in the type promotion system. The "weak scalar" treatment is exactly what NEP 50 introduced. So on numpy 1.26.4, -1j * k_f32 * spacing / 2 → complex128, and np.exp(...) of a complex128 array → complex128.
It's possible the empirical test ran against a numpy 2.x environment, or that k_list[ax] was already float64 at the point of measurement (defeating the test). The regression test added in d545952 checks recorded sensor_data["p"] and "ux_staggered" dtypes, but those go through the .astype(self._dtype, copy=False) guard added in step() — so they'll pass regardless of what unstagger_ops itself is. The dtype of unstagger_ops isn't directly asserted.
The fix remains: cast the list comprehension result to self._complex_dtype:
```diff
  # Spectral shift: move velocity from staggered (mid-cell) to collocated (pressure) grid
- self.unstagger_ops = [xp.exp(-1j * self.k_list[ax] * self.spacing[ax] / 2) for ax in range(self.ndim)]
+ self.unstagger_ops = [xp.exp(-1j * self.k_list[ax] * self.spacing[ax] / 2).astype(self._complex_dtype) for ax in range(self.ndim)]
```
Greptile flagged unstagger_ops as a P1 (claimed 1j * k_f32 -> complex128
on numpy<2). Empirically that's not true under numpy 1.26.4 -- Python
complex is a weak type just like Python float/int, so unstagger_ops
stays complex64 when k_list is float32. The recorded ux_staggered output
also stays float32 because the prior commit already cast the unstagger
ifftn .real result to self._dtype.
Adding a test that records ('p', 'ux_staggered', 'uy_staggered') and
asserts each is float32 under dtype=np.float32 -- guards against any
future regression in the staggered-velocity path on either numpy major
version. 25 tests pass on numpy 1.26.4 and numpy 2.2.6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptile-apps re-review

@copilot resolve the merge conflicts in this pull request
Co-authored-by: waltsims <8669206+waltsims@users.noreply.github.com>
Resolved. I merged
@greptile re-review

Closes #338
```diff
  self._nonlinearity = lambda rho: self.BonA * rho**2 / (2 * self.rho0)
- self._nl_factor = lambda rho_split: (2 * sum(rho_split) + self.rho0) / self.rho0
+ self._nl_factor = lambda rho_split: (2 * _array_sum(rho_split) + self.rho0) / self.rho0
```
int 2 literal re-introduces dtype promotion on numpy < 2
_array_sum was added to prevent sum(rho_split) starting from Python int 0, but 2 * _array_sum(rho_split) still multiplies by a Python int — which numpy < 2 (pre-NEP 50) treats as np.int64 (a strong type). np.result_type(np.int64, np.float32) → np.float64, so nl_factor is float64 when BonA is enabled on numpy 1.x, and propagates through rho_split[i] into self.p → p_final. The same int 2 divisor in _nonlinearity (2 * self.rho0) causes the same promotion for the equation-of-state nonlinear term. On numpy >= 2 (NEP 50 weak scalars) the tests pass, but on numpy 1.26 they will silently produce float64 p_final instead of float32.
```diff
- self._nonlinearity = lambda rho: self.BonA * rho**2 / (2 * self.rho0)
- self._nl_factor = lambda rho_split: (2 * _array_sum(rho_split) + self.rho0) / self.rho0
+ _two = self.xp.asarray(2, dtype=self._dtype)
+ self._nonlinearity = lambda rho: self.BonA * rho**2 / (_two * self.rho0)
+ self._nl_factor = lambda rho_split: (_two * _array_sum(rho_split) + self.rho0) / self.rho0
```
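The promotion in question can be checked with explicit dtypes, which promote the same way on every numpy version:

```python
import numpy as np

# np.int64 is a strong type: pairing it with float32 promotes to float64.
print(np.result_type(np.int64, np.float32))  # float64

# A scalar carrying the target dtype avoids the promotion entirely.
two = np.float32(2)
x = np.zeros(3, dtype=np.float32)
print((two * x).dtype)  # float32
```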
**Greptile's dt_over_rho0 / rho_split claim is empirically false on numpy 1.26.4**

Greptile's review claims that `dt_over_rho0` and the `rho_split` update silently promote to float64 on numpy < 2. Direct verification with the freshly-installed environment (numpy 1.26.4):

```python
import numpy as np

dt = 1.23e-9  # Python float (what self.dt is)
rho = np.full((4, 4), 1000.0, dtype=np.float32)
print((dt / rho).dtype)  # float32 (line 663)

rho0 = np.full((4, 4), 1000.0, dtype=np.float32)
div_u_i = np.full((4, 4), 0.5, dtype=np.float32)
nl_factor = 1.0
print((dt * rho0 * div_u_i * nl_factor).dtype)  # float32 (line 716, linear)

rho_split = [np.full((4, 4), 0.1, dtype=np.float32) for _ in range(2)]
nl_factor_nl = (2 * rho_split[0] + 2 * rho_split[1] + rho0) / rho0
print((dt * rho0 * div_u_i * nl_factor_nl).dtype)  # float32 (line 716, nonlinear)
```

The full NEP 50 changed numpy scalar dtype rules (…
@greptile re-review
Closes #695.

**What**

Exposes precision control on the modern unified API. Pythonic / numpy-idiomatic naming and accepted input forms:

| Input | Resolves to |
|---|---|
| `None` (default) | `np.float64` |
| `np.float64` / `"float64"` / `"double"` / `float` / `np.dtype("f8")` | `np.float64` |
| `np.float32` / `"float32"` / `"single"` / `np.dtype("f4")` | `np.float32` |
| `"off"` (legacy MATLAB alias) | `np.float64` |
| anything else (`np.float16`, `np.complex64`, `"quad"`, …) | `ValueError` |

The MATLAB aliases (`"off"`, `"single"`, `"double"`) are kept as compat shortcuts for users porting from the legacy `SimulationOptions.data_cast` or MATLAB k-Wave's `DataCast`. Everything else uses `np.dtype()` for normalization, matching the broader numpy/scipy/torch convention.

**Why `dtype` instead of `data_cast`**

`data_cast` is a MATLAB term. The numpy ecosystem (numpy, pandas, jax, torch) uses `dtype` and accepts dtype-like inputs broadly. The modern unified API is a fresh design, so it should follow the Python idiom rather than the MATLAB one. The MATLAB-style strings still work, so MATLAB users lose nothing.

**How**

Python backend plumbs `dtype` through `Simulation`, which now stores `self._dtype` and uses it for every state-array allocation: `p`, `u`, `rho_split`, sensor-data buffers, PML arrays, source signal buffers, and the `_expand_to_grid` helper for `sound_speed` / `density` / `alpha_coeff` / `BonA` / `p0`. Default behavior unchanged (float64 everywhere).

C++ backend intentionally has no effect: the binary uses fixed internal precision regardless of HDF5 input dtype. Setting `dtype` to anything other than `np.float64` with `backend='cpp'` emits a `UserWarning` explaining this and pointing users at `backend='python'` for precision control.

**Test plan**

New file `tests/test_data_cast.py` (22 tests):

- `test_python_backend_float64_inputs` parametrized over `[None, np.float64, "float64", "double", float, "off", np.dtype("f8")]`: every form resolves to float64 output
- `test_python_backend_float32_inputs` parametrized over `[np.float32, "float32", "single", np.dtype("f4")]`: every form resolves to float32 output
- `test_default_dtype_is_float64`: calling without the kwarg gives float64 (back-compat)
- `test_invalid_dtype_raises` parametrized over `[np.float16, np.complex64, "float16", "complex64", "quad", 42, "garbage"]`: all raise `ValueError`
- `test_python_single_vs_double_numerical_agreement`: single and double runs agree to within `1e-4` relative error
- `test_cpp_backend_warns_on_non_float64_dtype`: `UserWarning` fires before binary runs
- `test_cpp_backend_silent_on_default_dtype`: no warning on default

Wider suite verified (62 tests): `test_native_solver`, `test_ivp_homogeneous_medium`, `test_issue_664_alpha_power_near_unity` all pass.

**Greptile Summary**

This PR adds a `dtype` parameter to `kspaceFirstOrder()` that lets callers control state-array precision for the Python backend (`np.float32` or `np.float64`), with MATLAB-style string aliases (`"single"`, `"double"`, `"off"`) kept for migration compatibility. The C++ backend correctly ignores the parameter and emits a `UserWarning` when a non-float64 value is passed. `_resolve_dtype` normalises every dtype-like input form and raises `ValueError` with framework-specific hints for torch/jax objects. Default behaviour (float64 everywhere) is unchanged. `Simulation` now stores `self._dtype` and `self._complex_dtype`, plumbed through field allocation, PML arrays, k-vectors, sensor buffers, and source operators; several FFT round-trips gained explicit `.astype()` casts. Remaining gaps:

- `kappa` / `source_kappa` not cast back to `self._dtype`
- `dt_over_rho0` still `float(self.dt) / float32_rho`
- `self.dt * self.rho0` on the `rho_split` update likewise uncast

On numpy>=2 all tests pass; on numpy 1.x these silently produce float64 arrays even when `dtype=np.float32`.

**Confidence Score: 3/5**
Safe to merge for numpy>=2 environments; float32 precision is not reliably enforced on numpy 1.x due to several uncast Python-scalar multiplications identified in prior review rounds.
The dtype-plumbing is thorough for array allocations, PML, k-vectors, and sensor buffers, and the FFT cast-back guards prevent most output-dtype drift. However, kappa/source_kappa construction (Python-float c_ref times float32 k_mag), dt_over_rho0 (Python float divided by float32), and the rho_split update (self.dt * self.rho0) still produce float64 intermediates on numpy 1.x, silently defeating the float32 promise on that platform.
`kwave/solvers/kspace_solver.py`: `_setup_kspace_operators` (`kappa`/`source_kappa`), the `dt_over_rho0` precomputation in `_setup_fields`, and the `rho_split` update in `step()` all have uncast Python-scalar multiplications that defeat float32 precision on numpy 1.x.
Important Files Changed
Comments Outside Diff (7)
1. `kwave/solvers/kspace_solver.py`, line 663 (link): `dt_over_rho0` computed as Python `float` / float32 → silently float64 on numpy < 2. `self.dt` is stored as `float(self.kgrid.dt)` (a Python float, equivalent to `float64`). Dividing by a `float32` array gives a `float64` result under numpy < 2 (NEP 50 changed this in numpy 2.0). As a consequence, on every `step()`: `self.dt_over_rho0[i] * grad_p_i` (line 706) is float64 × float32 → float64, so `self.u[i]` is rebound to a float64 array after the very first step. The same Python-scalar promotion also affects line 716 (`self.dt * self.rho0 * div_u_i * nl_factor`), so `rho_split[i]` and, through `rho_total`, `self.p` also become float64. The sensor-data buffer (`sensor_data["p"]`) is pre-allocated float32 and silently narrows values on in-place assignment, so `result["p"]` tests pass, but `result["p_final"]` (line 773) is `self.p[interior].copy()`, with no narrowing, and will be float64 even when `dtype=np.float32` is requested, breaking the dtype contract on numpy < 2.

2. `kwave/solvers/kspace_solver.py`, lines 429-447 (link): complex128 on numpy < 2. `k_list` entries are now correctly cast to `self._dtype` (float32 when requested), but `self.c_ref` and `self.dt` are Python floats (float64), so `self.c_ref * k_mag * self.dt / 2` is float64 on numpy < 2, making `kappa` and `source_kappa` float64. The `1j *` Python complex literal then forces `op_grad_list` and `op_div_list` to complex128 rather than complex64. Cast `kappa`/`source_kappa` to `self._dtype` and the final operators to `self._complex_dtype` after construction.

3. `kwave/solvers/kspace_solver.py`, lines 661-663 (link): `self.dt` is stored as `float(self.kgrid.dt)`, a Python `float`, which numpy < 2.0 (pre-NEP 50) treats as `np.float64` in type promotion. Dividing a float64 scalar by a float32 array yields float64 on numpy 1.x, so `dt_over_rho0` is float64 even when `self._dtype is np.float32`. On the first `step()`, `self.dt_over_rho0[i] * grad_p_i` (float64 × float32 → float64) and the outer `pml_sg * (...)` product make `self.u[i]` float64. The same Python-scalar promotion on line 716 (`self.dt * self.rho0 * div_u_i`) then forces `self.rho_split[i]` to float64, which propagates through `_array_sum(rho_split)` into `self.p`. Because `result["p_final"]` is a direct `.copy()` of `self.p` (no pre-allocated float32 buffer to narrow into), it will be float64 on numpy < 2 even when `dtype=np.float32`, causing `test_python_backend_float32_inputs` to fail on numpy 1.x. Pre-cast `dt` to `self._dtype` at setup time and reuse it in `step()`.

4. `kwave/solvers/kspace_solver.py`, line 716 (link): `self.dt` is a Python `float` (float64), so `self.dt * self.rho0` (float64 × float32) → float64 on numpy < 2. `nl_factor = 1.0` (Python float) compounds this in the linear path. The result is that `rho_split[i]` ends up as float64, which propagates through `rho_total` into `self.p`, making `p_final` float64 regardless of `self._dtype`. Replace `self.dt` with `self._dt_typed` (the dtype-cast scalar computed during setup).

5. `kwave/solvers/kspace_solver.py`, line 663 (link): `dt_over_rho0` is computed by dividing a Python `float` (`self.dt = float(kgrid.dt)`) by a float32 array. On numpy < 2 (pre-NEP 50), Python scalars are strong `np.float64`, so `self.dt / rho` yields a float64 result for each element of the list. At line 706, `self.dt_over_rho0[i] * grad_p_i` (float64 × float32) then rebinds `self.u[i]` to float64; there is no `.astype()` guard on that assignment, unlike the `_diff()` return path. `u_final` (line 777) is then `self.u[i][interior].copy()`, which will be float64 even when `dtype=np.float32`.

6. `kwave/solvers/kspace_solver.py`, line 716 (link): `self.dt` is a Python `float` (float64). On numpy < 2, float64 × float32 is a strong-type promotion to float64, so `self.dt * self.rho0 * div_u_i * nl_factor` evaluates to float64 and rebinds `self.rho_split[i]` to a float64 array; there is no `.astype()` narrowing guard here. Float64 `rho_split` entries then flow through `_array_sum(self.rho_split)` → `self.p`, making `p_final` float64 even when `dtype=np.float32`.

7. `kwave/solvers/kspace_solver.py`, lines 429-431 (link): `self.c_ref` and `self.dt` are both Python `float` (= float64). On numpy < 2, float64 * float32_array is a strong promotion, so `self.c_ref * k_mag * self.dt / 2` produces a float64 array even though `k_mag` was cast to `self._dtype`. `kappa` and `source_kappa` are therefore float64, which in turn forces `op_grad_list`/`op_div_list` to complex128 (Python `1j` is also a strong complex128 on numpy < 2). Every call to `_diff()` then runs the FFT round-trip in float64 arithmetic even when `dtype=np.float32` was requested, defeating the purpose of the precision parameter.

Reviews (10): Last reviewed commit: "Merge branch 'master' into feature-data-..."